Near Minimal Weighted Word Graphs for Post-processing Speech
Abstract
Large vocabulary speech recognition applications can benefit from an efficient data structure for representing large numbers of acoustic hypotheses compactly. Word graphs or lattices generated by acoustic recognition engines are generally not compact and must be post-processed to keep lattice sizes small; however, algorithms designed for this task need to reduce the size of the lattice without either eliminating hypotheses or distorting their relative acoustic probabilities. In this paper, we will discuss the relevant criteria for measuring graph size, compare the advantages of two different structures for graphs, and introduce a new data structure and compression algorithm which give additional graph compression and maintain exact hypothesis path scores by storing probability information on both nodes and arcs within the graph.
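The idea of keeping path scores exact while storing scores on both nodes and arcs can be illustrated with a small sketch. This is not the paper's data structure or compression algorithm, which the abstract does not specify. The `WordGraph` class, its field names, and the integer costs (standing in for negative log probabilities) are all assumptions made for illustration. The sketch only shows that moving a cost shared by every arc leaving a node onto the node itself leaves each hypothesis's total score unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class WordGraph:
    """Hypothetical acyclic word graph with costs on both nodes and arcs."""
    start: str
    final: str
    node_cost: dict = field(default_factory=dict)  # node -> cost (default 0)
    arcs: dict = field(default_factory=dict)       # node -> [(word, cost, next_node)]

    def add_arc(self, src, word, cost, dst):
        self.arcs.setdefault(src, []).append((word, cost, dst))

    def hypotheses(self):
        """Return {word sequence: total cost} for every start-to-final path.

        Assumes the graph is acyclic and that word sequences are distinct.
        """
        results = {}
        stack = [(self.start, (), self.node_cost.get(self.start, 0))]
        while stack:
            node, words, cost = stack.pop()
            if node == self.final:
                results[words] = cost
                continue
            for word, arc_cost, dst in self.arcs.get(node, []):
                total = cost + arc_cost + self.node_cost.get(dst, 0)
                stack.append((dst, words + (word,), total))
        return results


# Version 1: all costs live on arcs.
arc_only = WordGraph(start="s", final="f")
arc_only.add_arc("s", "the", 5, "a")
arc_only.add_arc("a", "cat", 3, "f")
arc_only.add_arc("a", "cap", 4, "f")

# Version 2: the cost common to both arcs leaving "a" (3) is stored on the node,
# and each arc keeps only its remainder.
mixed = WordGraph(start="s", final="f", node_cost={"a": 3})
mixed.add_arc("s", "the", 5, "a")
mixed.add_arc("a", "cat", 0, "f")
mixed.add_arc("a", "cap", 1, "f")

print(arc_only.hypotheses() == mixed.hypotheses())
```

Both graphs assign the same total cost to every word sequence, so the relative ranking of hypotheses is unchanged. Integer costs are used here only to keep the equality check exact.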
Similar resources
Spectral tilt modelling with GMMs for intelligibility enhancement of narrowband telephone speech
In mobile communications, post-processing methods are used to improve the intelligibility of speech in adverse background noise conditions. In this study, post-processing based on modelling the Lombard effect is investigated. The study focuses on comparing different spectral envelope estimation methods together with Gaussian mixture modelling in order to change the spectral tilt of speech in a ...
Psychoacoustics and speech perception in individuals with auditory neuropathy and normal individuals
Background: The main result of hearing impairment is a reduction in speech perception. Patients with auditory neuropathy can hear but cannot understand speech. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...
Edit-Distance Of Weighted Automata: General Definitions And Algorithms
The problem of computing the similarity between two sequences arises in many areas such as computational biology and natural language processing. A common measure of the similarity of two strings is their edit-distance, that is the minimal cost of a series of symbol insertions, deletions, or substitutions transforming one string into the other. In several applications such as speech recognition...
Comparative Effect of Visual and Auditory Teaching Techniques on Retention of Word Stress patterns: A Case Study of English as a Foreign Language Curriculum in Iran
This study investigated the effect of two teaching techniques, visual (Cuisenaire Rods) and auditory (nonsensical monosyllables, using Pratt speech processing software), on retention of word stress. To this end, 60 high school participants formed the two experimental groups of the study, each with 30 students, on the basis of their proficiency scores on the KET (Key English Test). In one experime...
The effect of pruning and compression on graphical representations of the output of a speech recognizer
Large vocabulary continuous speech recognition can benefit from an efficient data structure for representing a large number of acoustic hypotheses compactly. Word graphs or lattices have been chosen as such an efficient interface between acoustic recognition engines and subsequent language processing modules. This paper first investigates the effect of pruning during acoustic decoding on the qu...
Publication date: 1999